The Impact of Aging, Dementia, and SES on Cognitive Decline
A Longitudinal Study Using Linear Mixed Models
Adasia M., Bess T., Preethi R.
Introduction
Linear Mixed Models (LMMs) are powerful statistical tools designed to analyze data with complex structures, such as hierarchical data (e.g., individuals within groups) or repeated measures (e.g., assessments taken over time). Unlike traditional methods, LMMs account for variations at both the group level and the individual level, making them ideal for studying patterns of cognitive decline across aging populations.
Why LMMs?
Fixed Effects: Capture overall population trends, like how age or socioeconomic status influences cognitive function on average.
Random Effects: Model differences between individuals, accounting for unique trajectories over time or variability across participants.
Data Flexibility: Handle missing data and unbalanced datasets effectively, ensuring reliable results even when some observations are incomplete.
Literature Review
Linear Mixed Models: An Overview
Extend simple linear regression by incorporating fixed effects (population-level) and random effects (subject-level variability).
Ideal for analyzing longitudinal and hierarchical datasets.
Can provide unbiased estimates from incomplete data, provided the observations are missing at random (Bates, 2014; Gelman & Hill, 2007).
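As a compact reference, the structure described above is commonly written in matrix form. The notation below is the standard textbook formulation rather than notation taken from this project; the symbols \mathbf{X}_i, \mathbf{Z}_i, \mathbf{b}_i, \mathbf{G}, and \mathbf{R}_i are introduced here for illustration only.

```latex
\begin{aligned}
\mathbf{y}_i &= \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{b}_i + \boldsymbol{\varepsilon}_i, \\
\mathbf{b}_i &\sim \mathcal{N}(\mathbf{0}, \mathbf{G}), \qquad
\boldsymbol{\varepsilon}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{R}_i)
\end{aligned}
```

Here \mathbf{y}_i holds the repeated measurements for subject i, \boldsymbol{\beta} the fixed effects shared by everyone, and \mathbf{b}_i the subject-specific random effects. In a random-intercept model, \mathbf{Z}_i is simply a column of ones.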
Covariance Structure
Essential for modeling dependencies between repeated measures (Starkweather, 2010).
Positive covariance: variables move together.
Negative covariance: variables move in opposite directions.
Allows explicit modeling of within-subject variability, critical for longitudinal studies.
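To make the dependence between repeated measures concrete, the sketch below writes out the covariance implied by a random-intercept model for one subject with three visits. It assumes a subject intercept \mu_i with variance \sigma_\mu^2 and independent residuals with variance \sigma^2; both symbols are introduced here for illustration.

```latex
\operatorname{Var}(y_{ij}) = \sigma_\mu^2 + \sigma^2, \qquad
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_\mu^2 \quad (j \neq k)

\operatorname{Cov}(\mathbf{y}_i) =
\begin{pmatrix}
\sigma_\mu^2 + \sigma^2 & \sigma_\mu^2 & \sigma_\mu^2 \\
\sigma_\mu^2 & \sigma_\mu^2 + \sigma^2 & \sigma_\mu^2 \\
\sigma_\mu^2 & \sigma_\mu^2 & \sigma_\mu^2 + \sigma^2
\end{pmatrix}
```

Every pair of visits from the same subject shares the covariance \sigma_\mu^2, while measurements from different subjects are uncorrelated. This pattern is often called compound symmetry.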
Literature Review Continued
Robust Estimation
Methods that minimize the influence of outliers, ensuring parameter reliability (Agostinelli & Yohai, 2016).
Non-robust techniques like OLS regression are highly sensitive to extreme values and may yield biased results.
Challenges of Traditional Methods
Traditional linear models assume independence of observations, often violated in clustered data (Barr et al., 2013).
Handling missing data through listwise deletion or imputation can introduce bias (Enders, 2010).
Methods
Our Focus
This project applies LMMs to investigate:
The influence of age, dementia status, and socioeconomic factors on cognitive decline.
Cognitive function, measured using the Mini-Mental State Examination (MMSE), a widely used tool for assessing cognitive impairment.
How repeated assessments over time help us understand individual changes in cognitive function while capturing broader population trends.
Dataset: OASIS-Longitudinal MRI Data in Nondemented and Demented Older Adults
The Role of Age in Predicting Cognitive Decline: An Analysis Using MMSE Scores
By: Adasia McClinton
Data Cleaning and Ingestion
data <- read.csv("C:/IDC6940/IDC6940_BDP/oasis_longitudinal.csv")
library(mice)
imputed_data <- mice(data, m = 5, method = 'pmm')
iter imp variable
1 1 SES MMSE
1 2 SES MMSE
1 3 SES MMSE
1 4 SES MMSE
1 5 SES MMSE
2 1 SES MMSE
2 2 SES MMSE
2 3 SES MMSE
2 4 SES MMSE
2 5 SES MMSE
3 1 SES MMSE
3 2 SES MMSE
3 3 SES MMSE
3 4 SES MMSE
3 5 SES MMSE
4 1 SES MMSE
4 2 SES MMSE
4 3 SES MMSE
4 4 SES MMSE
4 5 SES MMSE
5 1 SES MMSE
5 2 SES MMSE
5 3 SES MMSE
5 4 SES MMSE
5 5 SES MMSE
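The models later in this section iterate over an object called imputed_list, which is not defined in the code shown. One plausible way to build such a list is to extract each completed dataset from the mice result. The sketch below is an assumption about that step, not the authors' code, and it uses the nhanes example data that ships with mice as a stand-in for the OASIS file.

```r
library(mice)

# Stand-in data with missing values (assumption: replaces the OASIS data frame)
imp <- mice(nhanes, m = 5, method = "pmm", seed = 123, printFlag = FALSE)

# Build one completed data frame per imputation
imputed_list <- lapply(seq_len(imp$m), function(i) mice::complete(imp, action = i))

# Number of completed datasets and remaining missing values in each
length(imputed_list)
sapply(imputed_list, function(d) sum(is.na(d)))
```

Each element of the list is an ordinary data frame, so it can be passed to lmer inside lapply.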
Age coefficient: -0.034 (small, non-significant decline with age; p = 0.285).
Key Findings
Age alone is not a significant predictor of MMSE scores.
This suggests that additional variables are needed to explain cognitive decline.
Model 0: Age as the Sole Predictor
R Code
# Model 0: Age as the sole predictor
library("lme4")
library("mitml")
library("Matrix")
model0 <- lapply(imputed_list, function(data) lmer(MMSE ~ Age + (1 | Subject.ID), data = data))
pooled_results <- testEstimates(model0, method = "D2")
summary(pooled_results)
Call:
testEstimates(model = model0, method = "D2")
Final parameter estimates and inferences obtained from 5 imputed data sets.
             Estimate  Std.Error  t.value         df  P(>|t|)    RIV    FMI
(Intercept)    29.804      2.445   12.189  1.353e+05    0.000  0.005  0.005
Age            -0.034      0.032   -1.070  1.286e+05    0.285  0.006  0.006
Unadjusted hypothesis test as appropriate in larger samples.
Results: The age effect on MMSE was small and non-significant, suggesting that age alone does not fully explain cognitive decline.
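The pooled table combines the five per-imputation fits into one estimate per coefficient. Assuming the pooling follows Rubin's rules, the quantities can be sketched as below, where \hat{Q}_l and U_l are the estimate and its squared standard error from imputed dataset l, and m = 5. These symbols are introduced here for illustration.

```latex
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{l=1}^{m} \hat{Q}_l, &
\bar{U} &= \frac{1}{m}\sum_{l=1}^{m} U_l, \\
B &= \frac{1}{m-1}\sum_{l=1}^{m} \left(\hat{Q}_l - \bar{Q}\right)^2, &
T &= \bar{U} + \left(1 + \frac{1}{m}\right) B, \\
\mathrm{RIV} &= \frac{\left(1 + \frac{1}{m}\right) B}{\bar{U}}
\end{aligned}
```

The pooled standard error is \sqrt{T}. RIV is the relative increase in variance due to missing data. The values near zero in the table (0.005 and 0.006) indicate that the imputations agree closely, so missing data add little uncertainty to these estimates.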
Model 1: Adding Dementia Status and SES
Examine the combined effect of age, dementia status (Group), and SES, including the age-by-group interaction.
Age and SES are not significant predictors in this model.
R Code
model1 <- lapply(imputed_list, function(data) {
  lmer(MMSE ~ Age * Group + SES + (1 | Subject.ID), data = data)
})
pooled_results <- testEstimates(model1, method = "D2")
summary(pooled_results)
Call:
testEstimates(model = model1, method = "D2")
Final parameter estimates and inferences obtained from 5 imputed data sets.
                      Estimate  Std.Error  t.value         df  P(>|t|)    RIV    FMI
(Intercept)             34.723      6.443    5.389  1.123e+07    0.000  0.001  0.001
Age                     -0.070      0.081   -0.873  7.274e+07    0.383  0.000  0.000
GroupDemented           -4.374      7.261   -0.602  4.318e+05    0.547  0.003  0.003
GroupNondemented        -4.255      6.925   -0.614  1.191e+08    0.539  0.000  0.000
SES                     -0.309      0.193   -1.598  1.538e+03    0.110  0.054  0.052
Age:GroupDemented        0.007      0.092    0.071  3.489e+05    0.943  0.003  0.003
Age:GroupNondemented     0.063      0.087    0.725  7.480e+07    0.468  0.000  0.000
Unadjusted hypothesis test as appropriate in larger samples.
Results: This model showed no significant effects for age, SES, or age-by-group interactions, indicating that additional factors may better explain cognitive decline.
Call:
testEstimates(model = final_model, method = "D2")
Final parameter estimates and inferences obtained from 5 imputed data sets.
             Estimate  Std.Error  t.value         df  P(>|t|)    RIV    FMI
(Intercept)    17.292      3.646    4.742  10710.621    0.000  0.020  0.020
CDR            -5.166      0.470  -10.984    759.456    0.000  0.078  0.075
nWBV           15.783      4.922    3.207  12205.185    0.001  0.018  0.018
Unadjusted hypothesis test as appropriate in larger samples.
Results: CDR was significantly associated with MMSE scores: each one-point increase in CDR corresponded to an MMSE score about 5.2 points lower (estimate = -5.166, p < 0.001). Higher nWBV was also associated with higher MMSE (estimate = 15.783, p = 0.001), consistent with preserved brain volume accompanying preserved cognitive function.
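To give the pooled coefficients a concrete reading, the sketch below plugs them into the fixed-effects part of the model for three CDR levels. The nWBV value of 0.73 is a hypothetical illustration, not a value taken from the data, and the random intercept is set to zero, so these are predictions for a typical subject.

```r
# Pooled fixed-effect estimates copied from the final model table
b0     <- 17.292
b_cdr  <- -5.166
b_nwbv <- 15.783

# Fixed-effects prediction for a typical subject (random intercept = 0)
predict_mmse <- function(cdr, nwbv) b0 + b_cdr * cdr + b_nwbv * nwbv

# Hypothetical nWBV of 0.73 at CDR 0, 0.5, and 1
pred <- predict_mmse(cdr = c(0, 0.5, 1), nwbv = 0.73)
round(pred, 1)
# 28.8 26.2 23.6
```

Moving from CDR 0 to CDR 1 at the same brain volume lowers the predicted MMSE by about 5.2 points, which matches the CDR coefficient.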
Conclusions
Key Takeaways
Age alone is not a significant predictor of MMSE scores.
Dementia severity (CDR) is a strong negative predictor of MMSE.
Higher normalized whole brain volume (nWBV) is associated with higher MMSE scores.
Brain Volume: The Role of Age and Socioeconomic Status
nWBV_{ij} : Normalized whole brain volume for subject (i) at time (j).
\beta_0 : Overall intercept (fixed effect).
\beta_1 : Fixed effect of Age.
\beta_2 : Fixed effect of SES.
\mu_{i} : Random intercept for each subject.
\epsilon_{ij} : Residual error term.
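The symbols defined above combine into the following random-intercept model. The distributional assumptions on \mu_i and \epsilon_{ij} are the usual defaults for this kind of model and are stated here as an assumption; the variance symbols \sigma_\mu^2 and \sigma^2 are introduced for illustration.

```latex
\text{nWBV}_{ij} = \beta_0 + \beta_1 \cdot \text{Age}_{ij} + \beta_2 \cdot \text{SES}_{ij} + \mu_{i} + \epsilon_{ij},
\qquad \mu_i \sim \mathcal{N}(0, \sigma_\mu^2), \quad \epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Dropping the \beta_2 term gives the Age-only model.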
METHODOLOGY
The two most common methods for fitting linear mixed models are:
1. Maximum Likelihood Estimation (MLE):
Finds the parameter values under which the observed data are most probable.
Estimates the fixed effects (population-level parameters) and the variance components (random-effect and residual variances) together by maximizing the likelihood of the observed data.
2. Restricted Maximum Likelihood Estimation (REML):
Maximizes the likelihood of the data after adjusting for the fixed effects, focusing on estimation of the variance components.
Gives less biased variance estimates than MLE because it accounts for the degrees of freedom lost in estimating the fixed effects.
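The difference between the two criteria is easiest to see by fitting the same model both ways. The sketch below uses the sleepstudy data that ships with lme4 as a stand-in, so the variables Reaction, Days, and Subject are not from this project.

```r
library(lme4)

# Same random-intercept model fitted with each criterion
fit_reml <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = TRUE)
fit_ml   <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

# Extract variance components as data frames (vcov column holds the variances)
vc_reml <- as.data.frame(VarCorr(fit_reml))
vc_ml   <- as.data.frame(VarCorr(fit_ml))

# Side-by-side comparison of the variance components
comparison <- data.frame(
  component = vc_reml$grp,
  REML      = vc_reml$vcov,
  ML        = vc_ml$vcov
)
print(comparison)

# Fixed-effect estimates under both criteria
print(rbind(REML = fixef(fit_reml), ML = fixef(fit_ml)))
```

In this example the ML variance estimates come out smaller than the REML ones, while the fixed effects barely move.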
#install.packages("lme4")
library(lme4)
oasis_data$gender <- oasis_data$'M/F'
oasis_data$SubjectID <- oasis_data$'Subject ID'
# Fit the linear mixed model with only Age as the predictor, reml
model <- lmer(nWBV ~ Age + (1 | SubjectID), data = oasis_data)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: nWBV ~ Age + (1 | SubjectID)
Data: oasis_data
REML criterion at convergence: -1852.4
Scaled residuals:
Min 1Q Median 3Q Max
-4.0245 -0.4668 0.0150 0.4481 3.6450
Random effects:
Groups Name Variance Std.Dev.
SubjectID (Intercept) 1.009e-03 0.031772
Residual 6.971e-05 0.008349
Number of obs: 354, groups: SubjectID, 142
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.9977560 0.0171965 58.02
Age -0.0034837 0.0002208 -15.78
Correlation of Fixed Effects:
(Intr)
Age -0.988
Equation and interpretation of Age as the only predictor
The fitted linear mixed model is:
\text{nWBV}_{ij} = 0.9978 - 0.0035 \cdot \text{Age}_{ij} + \mu_{i} + \epsilon_{ij}
where:
• 0.9978 is the fixed intercept, the estimated average nWBV when Age is 0 (an extrapolation far outside the age range of the older adults in this study).
• -0.0035 is the fixed effect estimate for Age, indicating that for each additional year of age, nWBV decreases by approximately 0.0035 units on average.
INTERPRETATION:
Age has a significant negative effect on nWBV.
The t-value for Age is -15.78, far beyond the conventional threshold of about |t| > 2.
This shows a strong association between increasing age and decreasing nWBV.
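The random-effects block of the output above also says how much of the variation in nWBV lies between subjects rather than within them. A common summary is the intraclass correlation (ICC), which is the subject variance divided by the total variance. The calculation below uses the two variances from the Age-only REML output; the ICC itself is not reported in the source.

```r
# Variance components copied from the Age-only REML output
var_subject  <- 1.009e-03  # SubjectID (Intercept)
var_residual <- 6.971e-05  # Residual

# Share of the total variance that is between subjects
icc <- var_subject / (var_subject + var_residual)
round(icc, 3)
# 0.935
```

About 94% of the variance not explained by Age is due to stable differences between subjects, which is why a model that ignores the repeated-measures structure would be inappropriate here.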
2. Age and Socioeconomic Status as Predictors
lmer function uses REML unless specified otherwise.
# Fit Linear Mixed Model with Age and Socioeconomic Status as the predictors
model <- lmer(nWBV ~ Age + SES + (1 | SubjectID), data = oasis_data)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: nWBV ~ Age + SES + (1 | SubjectID)
Data: oasis_data
REML criterion at convergence: -1842.3
Scaled residuals:
Min 1Q Median 3Q Max
-4.0215 -0.4692 0.0140 0.4432 3.6478
Random effects:
Groups Name Variance Std.Dev.
SubjectID (Intercept) 0.0010160 0.031874
Residual 0.0000697 0.008349
Number of obs: 354, groups: SubjectID, 142
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.9956305 0.0183229 54.338
Age -0.0034846 0.0002211 -15.761
SES 0.0008823 0.0024137 0.366
Correlation of Fixed Effects:
(Intr) Age
Age -0.933
SES -0.342 0.015
Equation and interpretation of Age and SES:
The model equation:
\text{nWBV}_{ij} = 0.9956 - 0.0035 \cdot \text{Age}_{ij} + 0.0009 \cdot \text{SES}_{ij} + \mu_{i} + \epsilon_{ij}
where:
• 0.9956 is the intercept, representing the estimated nWBV when both Age and SES are 0.
• -0.0035 is the coefficient for Age, indicating that each additional year of age is associated with an average decrease in nWBV of approximately 0.0035 units.
• 0.0009 is the coefficient for SES, suggesting that for each unit increase in SES, there is a slight positive association with nWBV, though it is not statistically significant (t-value = 0.366).
INTERPRETATION
Age
Strong, statistically significant negative effect on nWBV.
Consistent with prior findings (t-value of -15.761).
SES
Small positive estimate that is not statistically significant (t-value = 0.366).
SES may not contribute meaningfully to explaining variation in nWBV in this model.
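Because lmer reports t-values without p-values, one quick way to back up the significance statements above is an approximate 95% Wald interval, computed as estimate ± 1.96 × standard error. The sketch below applies this to the Age and SES rows of the REML output. It relies on a normal approximation, which is an assumption, and is not a calculation from the source.

```r
# Estimates and standard errors copied from the Age + SES REML output
est <- c(Age = -0.0034846, SES = 0.0008823)
se  <- c(Age =  0.0002211, SES = 0.0024137)

# Approximate 95% Wald intervals under a normal approximation
z  <- qnorm(0.975)
ci <- cbind(lower = est - z * se, upper = est + z * se)
print(signif(ci, 3))

# TRUE when the interval excludes zero
excludes_zero <- ci[, "lower"] > 0 | ci[, "upper"] < 0
print(excludes_zero)
```

The Age interval lies entirely below zero, while the SES interval spans zero, which agrees with the t-value reading.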
Maximum Likelihood (predictors: age and ses)
R script for ML estimation:
# Fit the model with ML for model comparison
model_ml <- lmer(nWBV ~ Age + SES + (1 | SubjectID), data = oasis_data, REML = FALSE)
summary(model_ml)
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: nWBV ~ Age + SES + (1 | SubjectID)
Data: oasis_data
AIC BIC logLik deviance df.resid
-1867.5 -1848.2 938.8 -1877.5 349
Scaled residuals:
Min 1Q Median 3Q Max
-4.0300 -0.4722 0.0137 0.4444 3.6503
Random effects:
Groups Name Variance Std.Dev.
SubjectID (Intercept) 9.979e-04 0.031589
Residual 6.953e-05 0.008338
Number of obs: 354, groups: SubjectID, 142
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.9952308 0.0182348 54.579
Age -0.0034795 0.0002202 -15.804
SES 0.0008834 0.0023926 0.369
Correlation of Fixed Effects:
(Intr) Age
Age -0.933
SES -0.341 0.015
Interpretation for Maximum Likelihood:
Random Effects:
(Intercept) Variance: 0.0009979 (SD: 0.0316). This is the variability in baseline nWBV between subjects.
Residual Variance: 0.00006953 (SD: 0.00834). This is the remaining within-subject variance in nWBV after accounting for both fixed and random effects.
Fixed Effects:
Intercept: 0.9952. The estimated nWBV when Age and SES are both 0.
Age: -0.00348. nWBV decreases with increasing age.
For each one-year increase in Age, nWBV is expected to decrease by about 0.00348 units, holding SES constant.
t-value: -15.804 (large in magnitude), so Age is statistically significant.
SES: 0.00088. This positive coefficient suggests that as SES increases, there is a very slight increase in nWBV.
With a low t-value (0.369), SES does not have a statistically significant effect on nWBV in this model.
INTERPRETATION
Age is a significant predictor of nWBV,
with a negative effect indicating loss of brain volume as age increases.
SES appears to have little to no effect on nWBV based on this model.
Comparison of the two methods:
• Fixed Effects: The estimates for Age and SES under REML are nearly identical to those under ML, showing consistent results.
• Variance Components: The ML estimates (0.0009979 and 0.00006953) are slightly smaller than the REML estimates (0.0010160 and 0.0000697), as expected because ML does not adjust for estimating the fixed effects.
• Age has a significant negative impact on nWBV, indicating loss of brain volume with age.
• SES appears to have little to no effect on nWBV.
In summary,
REML is appropriate for final model interpretation because it provides less biased estimates of the variance components,
while both methods confirm Age as a key predictor of nWBV decline.
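ML fits are typically used when comparing models that differ in their fixed effects, for example with a likelihood ratio test. The sketch below shows what such a comparison could look like for a model with and without SES. It runs on simulated stand-in data, so every variable name and value in it is hypothetical, and it is not an analysis from the source.

```r
library(lme4)

set.seed(2024)
n_subj  <- 60
n_visit <- 3

# Simulated stand-in data; all names and values here are hypothetical
sim <- data.frame(
  id  = factor(rep(seq_len(n_subj), each = n_visit)),
  age = rep(runif(n_subj, 60, 90), each = n_visit) +
        rep(0:(n_visit - 1), times = n_subj),
  ses = rep(sample(1:5, n_subj, replace = TRUE), each = n_visit)
)
subj_effect <- rep(rnorm(n_subj, sd = 0.03), each = n_visit)
sim$nwbv <- 1.0 - 0.0035 * sim$age + subj_effect + rnorm(nrow(sim), sd = 0.008)

# Nested models fitted by ML so their likelihoods are comparable
m_age     <- lmer(nwbv ~ age + (1 | id), data = sim, REML = FALSE)
m_age_ses <- lmer(nwbv ~ age + ses + (1 | id), data = sim, REML = FALSE)

# Likelihood ratio test for the added SES term
lrt <- anova(m_age, m_age_ses)
print(lrt)
```

A large p-value in the second row would indicate that adding SES does not improve the fit. REML likelihoods should not be compared this way when the fixed effects differ.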
#install.packages("lme4")
library(lme4)
oasis_data$Group <- factor(oasis_data$Group)
oasis_data$SubjectID <- oasis_data$'Subject ID'
# Fit the linear mixed model with only Dementia Status as the predictor, reml
model <- lmer(CDR ~ Group + (1 | SubjectID), data = oasis_data)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: CDR ~ Group + (1 | SubjectID)
Data: oasis_data
REML criterion at convergence: -141.5
Scaled residuals:
Min 1Q Median 3Q Max
-2.2557 -0.4745 -0.0123 -0.0079 5.3656
Random effects:
Groups Name Variance Std.Dev.
SubjectID (Intercept) 0.01641 0.1281
Residual 0.02604 0.1614
Number of obs: 354, groups: SubjectID, 142
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.25057 0.04372 5.732
GroupDemented 0.42252 0.04914 8.597
GroupNondemented -0.24607 0.04778 -5.150
Correlation of Fixed Effects:
(Intr) GrpDmn
GroupDemntd -0.890
GropNndmntd -0.915 0.814
Equation and interpretation of Dementia Status as the only predictor
The fitted linear mixed model is:
\text{CDR}_{ij} = 0.2506 + 0.4225 \cdot \text{Demented}_{ij} - 0.2461 \cdot \text{Nondemented}_{ij} + \mu_{i} + \epsilon_{ij}
where \text{Demented}_{ij} and \text{Nondemented}_{ij} are 0/1 indicators of group membership:
• If the group is “Demented”, the coefficient for Group Demented is used.
• If the group is “Nondemented”, the coefficient for Group Nondemented is used.
• If the group is the baseline (the reference level that lmer omits from the output; in the OASIS longitudinal data this is the “Converted” group), no extra term is added for Group.
INTERPRETATION:
• The intercept (0.2506) is the estimated average CDR score in the baseline group.
• The coefficient for Group Demented (0.4225) indicates that, on average, individuals in the “Demented” group have a CDR score 0.4225 higher than the baseline group.
• The coefficient for Group Nondemented (-0.2461) indicates that, on average, individuals in the “Nondemented” group have a CDR score 0.2461 lower than the baseline group.
• t-value for Demented = 8.597
• t-value for Nondemented = -5.150
• This shows a strong association between Dementia Status and CDR.
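The group coefficients above are differences from a reference level, which R picks as the first factor level in alphabetical order unless told otherwise. The sketch below shows that coding on a toy factor. The level names are assumed to match the Group column in the OASIS file, and the toy vector is hypothetical.

```r
# Toy factor with the three assumed group labels
group <- factor(c("Converted", "Demented", "Nondemented", "Demented"))

# The first level is the reference and gets no column of its own
levels(group)
X <- model.matrix(~ group)
colnames(X)
# "(Intercept)" "groupDemented" "groupNondemented"

# Choosing a different reference level changes what the coefficients compare
group_ref <- relevel(group, ref = "Nondemented")
X_ref <- model.matrix(~ group_ref)
colnames(X_ref)
# "(Intercept)" "group_refConverted" "group_refDemented"
```

With “Nondemented” as the reference, each group coefficient would instead be read as a difference from the nondemented group, which may be the more natural comparison.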
2. Dementia Status and Education as Predictors
lmer function uses REML unless specified otherwise.
# Fit Linear Mixed Model with Dementia Status (Group) and Education (EDUC) as the predictors
model <- lmer(CDR ~ Group + EDUC + (1 | SubjectID), data = oasis_data)
summary(model)
Linear mixed model fit by REML ['lmerMod']
Formula: CDR ~ Group + EDUC + (1 | SubjectID)
Data: oasis_data
REML criterion at convergence: -134.6
Scaled residuals:
Min 1Q Median 3Q Max
-2.2818 -0.4040 -0.0276 0.0290 5.3745
Random effects:
Groups Name Variance Std.Dev.
SubjectID (Intercept) 0.01626 0.1275
Residual 0.02603 0.1614
Number of obs: 354, groups: SubjectID, 142
Fixed effects:
Estimate Std. Error t value
(Intercept) 0.149869 0.086623 1.730
GroupDemented 0.431773 0.049489 8.725
GroupNondemented -0.245532 0.047645 -5.153
EDUC 0.006606 0.004910 1.345
Correlation of Fixed Effects:
(Intr) GrpDmn GrpNnd
GroupDemntd -0.564
GropNndmntd -0.468 0.807
EDUC -0.864 0.139 0.009
Equation and interpretation of Dementia Status and Education Level:
The fitted linear mixed model is:
\text{CDR}_{ij} = 0.1499 + 0.4318 \cdot \text{Demented}_{ij} - 0.2455 \cdot \text{Nondemented}_{ij} + 0.0066 \cdot \text{EDUC}_{ij} + \mu_{i} + \epsilon_{ij}
• The coefficient for Group Demented (0.4318) indicates that, on average, individuals in the “Demented” group have a CDR score 0.4318 higher than the baseline group, holding education constant.
• The coefficient for Group Nondemented (-0.2455) indicates that, on average, individuals in the “Nondemented” group have a CDR score 0.2455 lower than the baseline group, holding education constant.
• The coefficient for Education Level (0.0066) indicates that, for each additional year of education, a subject’s CDR score is higher by about 0.0066 units on average.
• t-value for Demented = 8.725
• t-value for Nondemented = -5.153
• t-value for EDUC = 1.345
Conclusion
• Dementia Status (Group) is strongly associated with CDR.
• Education (EDUC) has a small, positive association with CDR, but this effect is not statistically significant (the t-value of 1.345 is below the conventional threshold of about 2).
References
Agostinelli, C., & Yohai, V. J. (2016). Composite robust estimators for linear mixed models. Journal of the American Statistical Association, 111(516), 1764-1774. https://doi.org/10.1080/01621459.2015.1115358
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.
Bates, D. (2014). Fitting linear mixed-effects models using lme4. arXiv preprint arXiv:1406.5823.